-
- >Three questions,
- >
- > 1) If we now expect quotes around tags, are we still meant to understand % as
- > an escape character within tags?
-
- In short, I think so.
-
- These dang things get parsed twice: once by the SGML parser, and once
- by the URL parser.
-
- After the HREF=, the SGML parser is looking for an attribute value,
- which may be a token or a literal. The syntax of a URL conflicts with
- the syntax of a token, so you've got to use a literal, i.e. you've
- got to put quotes around it.
-
- To compute the value of the HREF attribute, the SGML parser grabs
- everything between ""s (or ''s, actually). In fact, it expands
- &entity; references too!
-
- Then you hand the value of the HREF attribute to the URL parser.
- It better be a legal URL at this point. I don't know if the URL
- parsing code can handle spaces in a URL or not. If not, they've
- got to be represented by the %nn construct.
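-
- A rough sketch of the two passes, in Python. This is an illustration
- only, not the linemode or MidasWWW code: href_value and selector are
- made-up names, the regex stands in for a real SGML parser, and
- html.unescape expands HTML entity names rather than whatever entities
- a DTD declares.

```python
import re
from html import unescape
from urllib.parse import unquote, urlsplit


def href_value(start_tag: str) -> str:
    """Pass 1 (SGML side): take the HREF literal between "" or '' and
    expand entity references in it. Hypothetical helper; a regex is a
    stand-in for a real SGML parser."""
    m = re.search(r'''href\s*=\s*(?:"([^"]*)"|'([^']*)')''',
                  start_tag, re.IGNORECASE)
    if m is None:
        raise ValueError("no quoted HREF attribute found")
    literal = m.group(1) if m.group(1) is not None else m.group(2)
    return unescape(literal)


def selector(url: str) -> str:
    """Pass 2 (URL side): split the URL and undo %nn escapes in the path."""
    return unquote(urlsplit(url).path)


tag = '<A HREF="gopher://gopher.micro.umn.edu:70/00/Some%20Stuff">'
url = href_value(tag)   # still contains %20: the SGML pass leaves it alone
print(url)
print(selector(url))    # the URL pass turns %20 back into a space
```

- The point is only the ordering: entity references are resolved by the
- first pass, and %nn is not touched until the second.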
-
- NOTE: There's an SGML construct: &#SPACE; or &#123; designed for the same
- purpose. We might want to remove the quoting mechanism from the
- URL spec, and say that you use whatever quoting mechanisms the
- enclosing data format requires.
-
-
- > 2) Which of the following do I need to support, and which is the "approved"
- > method of accessing gopher?
- >
- > href="gopher://gopher.micro.umn.edu:70/00/Some Stuff"
-
- This is legal SGML -- dunno if it's a legal URL.
-
- > href="gopher://gopher.micro.umn.edu:70/00/Some%20Stuff"
-
- This is probably your best bet for the current linemode code.
-
- > href=gopher://gopher.micro.umn.edu:70/00/Some%20Stuff
-
- SGML parsers won't grok this.
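-
- To make the three cases concrete, here is a small Python sketch. It
- assumes an unquoted attribute value must be a name token made of
- letters, digits, "." and "-" (my reading of the reference concrete
- syntax), and it uses urllib.parse.quote to produce the %20 form.
- needs_quotes and escape_selector are names invented for this example.

```python
import re
from urllib.parse import quote

# Assumption: an unquoted SGML attribute value is limited to name
# characters (letters, digits, "." and "-").
NAME_TOKEN = re.compile(r"[A-Za-z0-9.\-]+\Z")


def needs_quotes(value: str) -> bool:
    """True if the value cannot be written as a bare token."""
    return NAME_TOKEN.match(value) is None


def escape_selector(path: str) -> str:
    """Replace characters such as space with %nn, keeping "/" as is."""
    return quote(path, safe="/")


url = "gopher://gopher.micro.umn.edu:70/00/Some%20Stuff"
print(needs_quotes(url))                  # ":" and "/" are not name characters
print(needs_quotes("top"))                # a plain name is fine unquoted
print(escape_selector("/00/Some Stuff"))  # space becomes %20
```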
-
- For starters, you've got kind of a bad design for handling SGML
- attributes. You parse them twice: once to stick them in the param
- resource, and once to take them out of the param resource and stick
- them in the href and name resources.
-
- Rather than a param resource, the parsing code should build an Xt ArgList
- with the attribute names and values. Then it can just call XtSetValues
- when it's done parsing the start tag. This would be a minor modification
- to my current version of the MidasWWW code using my HTML parsing library.
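-
- The shape of that change, sketched in Python rather than Xt C so it
- runs on its own. Widget.set_values is a stand-in for XtSetValues and
- the list of pairs is a stand-in for the ArgList; both, along with
- parse_attributes, are hypothetical and not taken from MidasWWW.

```python
import re

# Matches NAME="value" or NAME='value' inside a start tag.
ATTRIBUTE = re.compile(r'''(\w+)\s*=\s*(?:"([^"]*)"|'([^']*)')''')


class Widget:
    """Stand-in for an Xt widget; set_values plays the role of XtSetValues."""

    def __init__(self):
        self.resources = {}
        self.updates = 0

    def set_values(self, args):
        self.resources.update(args)
        self.updates += 1


def parse_attributes(start_tag):
    """Parse the start tag once into (name, value) pairs."""
    return [(name.lower(), dq or sq)
            for name, dq, sq in ATTRIBUTE.findall(start_tag)]


widget = Widget()
args = parse_attributes('<A NAME="top" HREF="gopher://host/00/x">')
widget.set_values(args)   # one call, after the whole start tag is parsed
print(widget.resources)
```

- Each attribute is parsed exactly once, and the widget sees a single
- update instead of a param string it has to take apart again.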
-
- > 3) Is the % meant to act as an escape character in search strings? ie
- >
- > href="http://slacvm.slac.stanford.edu/FIND/PARTICLE?PI%nn"
- >
- > meant to find entries for PI+ ? (where nn is the ascii code for +).
-
- Yeah... I've got a bunch of questions like this one. My understanding
- is that everything after the scheme: is defined by the individual scheme.
- It's not safe to just replace %nn by the corresponding ASCII character
- in all URLs. The %nn quoting mechanism is specific to the gopher scheme.
- (It might be used by other schemes too, but it's not a universal mechanism.)
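-
- In code, "defined by the individual scheme" might look like the Python
- sketch below. The set of schemes that use %nn is an assumption made
- for the example (only gopher is listed, per the paragraph above), and
- decode_rest is a made-up name. The + character is ASCII 2B hex, so
- the escaped form of PI+ would be PI%2B.

```python
from urllib.parse import unquote

# Assumption for illustration: only these schemes define %nn escapes.
ESCAPING_SCHEMES = {"gopher"}


def decode_rest(url: str) -> str:
    """Return the part after "scheme:", decoding %nn only when the
    scheme is known to define that escape."""
    scheme, _, rest = url.partition(":")
    if scheme.lower() in ESCAPING_SCHEMES:
        return unquote(rest)
    return rest


print(decode_rest("gopher://gopher.micro.umn.edu:70/00/Some%20Stuff"))
print(decode_rest("http://slacvm.slac.stanford.edu/FIND/PARTICLE?PI%2B"))
print(unquote("PI%2B"))   # what the search string would mean if %nn applied
```

- Whether the http search string should be decoded is exactly the open
- question; the sketch just shows the decision being made per scheme.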
-
- I've got some design ideas for the WWW library that I think would obviate
- the need for implementors like Tony to even mess with this stuff.
-
- Details as they develop...
-
- Tony: I'll send you my HTML parsing work separately.
-
- Dan
-
-
-